NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Investigating the sources of variable impact of pathogenic variants in monogenic metabolic conditions

https://doi.org/10.1038/s41467-025-60339-7

Wei, Angela; Border, Richard; Fu, Boyang; Cullina, Sinéad; Brandes, Nadav; Jang, Seon-Kyeong; Sankararaman, Sriram; Kenny, Eimear E; Udler, Miriam S; Ntranos, Vasilis; et al (December 2025, Nature Communications)

Abstract Over three percent of people carry a dominant pathogenic variant, yet only a fraction of carriers develop disease. Disease phenotypes from carriers of variants in the same gene range from mild to severe. Here, we investigate underlying mechanisms for this heterogeneity: variable variant effect sizes, carrier polygenic backgrounds, and modulation of carrier effect by genetic background (marginal epistasis). We leveraged exomes and clinical phenotypes from the UK Biobank and the Mt. Sinai BioMeBiobank to identify carriers of pathogenic variants affecting cardiometabolic traits. We employed recently developed methods to study these cohorts, observing strong statistical support and clinical translational potential for all three mechanisms of variable carrier penetrance and disease severity. For example, scores from our recent model of variant pathogenicity were tightly correlated with phenotype amongst clinical variant carriers, they predicted effects of variants of unknown significance, and they distinguished gain- from loss-of-function variants. We also found that polygenic scores modify phenotypes amongst pathogenic carriers and that genetic background additionally alters the effects of pathogenic variants through interactions.
more » « less
Free, publicly-accessible full text available December 1, 2026
dotears: Scalable and consistent directed acyclic graph estimation using observational and interventional data

https://doi.org/10.1016/j.isci.2024.111673

Xue, Albert; Rao, Jingyou; Sankararaman, Sriram; Pimentel, Harold (February 2025, iScience)

Free, publicly-accessible full text available February 1, 2026
Scalable summary-statistics-based heritability estimation method with individual genotype level accuracy

https://doi.org/10.1101/gr.279207.124

Jeong, Moonseong; Pazokitoroudi, Ali; Liu, Zhengtong; Sankararaman, Sriram (September 2024, Genome Research)

SNP heritability, the proportion of phenotypic variation explained by genotyped SNPs, is an important parameter in understanding the genetic architecture underlying various diseases and traits. Methods that aim to estimate SNP heritability from individual genotype and phenotype data are limited by their ability to scale to Biobank-scale data sets and by the restrictions in access to individual-level data. These limitations have motivated the development of methods that only require summary statistics. Although the availability of publicly accessible summary statistics makes them widely applicable, these methods lack the accuracy of methods that utilize individual genotypes. Here we present a SUMmary-statistics-based Randomized Haseman-Elston regression (SUM-RHE), a method that can estimate the SNP heritability of complex phenotypes with accuracies comparable to approaches that require individual genotypes, while exclusively relying on summary statistics. SUM-RHE employs Genome-Wide Association Study (GWAS) summary statistics and statistics obtained on a reference population, which can be efficiently estimated and readily shared for public use. Our results demonstrate that SUM-RHE obtains estimates of SNP heritability that are substantially more accurate compared with other summary statistic methods and on par with methods that rely on individual-level data.
more » « less
Full Text Available
A scalable adaptive quadratic kernel method for interpretable epistasis analysis in complex traits

https://doi.org/10.1101/gr.279140.124

Fu, Boyang; Anand, Prateek; Anand, Aakarsh; Mefford, Joel; Sankararaman, Sriram (September 2024, Genome Research)

Our knowledge of the contribution of genetic interactions (epistasis) to variation in human complex traits remains limited, partly due to the lack of efficient, powerful, and interpretable algorithms to detect interactions. Recently proposed approaches for set-based association tests show promise in improving the power to detect epistasis by examining the aggregated effects of multiple variants. Nevertheless, these methods either do not scale to large Biobank data sets or lack interpretability. We propose QuadKAST, a scalable algorithm focused on testing pairwise interaction effects (quadratic effects) within small to medium-sized sets of genetic variants (window size ≤100) on a trait and provide quantified interpretation of these effects. Comprehensive simulations show that QuadKAST is well-calibrated. Additionally, QuadKAST is highly sensitive in detecting loci with epistatic signals and accurate in its estimation of quadratic effects. We applied QuadKAST to 52 quantitative phenotypes measured in ≈300,000 unrelated white British individuals in the UK Biobank to test for quadratic effects within each of 9515 protein-coding genes. We detect 32 trait-gene pairs across 17 traits and 29 genes that demonstrate statistically significant signals of quadratic effects (accounting for the number of genes and traits tested). Across these trait-gene pairs, the proportion of trait variance explained by quadratic effects is comparable to additive effects, with five pairs having a ratio >1. Our method enables the detailed investigation of epistasis on a large scale, offering new insights into its role and importance.
more » « less
Full Text Available
A scalable and robust variance components method reveals insights into the architecture of gene-environment interactions underlying complex traits

https://doi.org/10.1016/j.ajhg.2024.05.015

Pazokitoroudi, Ali; Liu, Zhengtong; Dahl, Andrew; Zaitlen, Noah; Rosset, Saharon; Sankararaman, Sriram (July 2024, The American Journal of Human Genetics)

Full Text Available
Characterizing the genetic architecture of drug response using gene-context interaction methods

https://doi.org/10.1016/j.xgen.2024.100722

Sadowski, Michal; Thompson, Mike; Mefford, Joel; Haldar, Tanushree; Oni-Orisan, Akinyemi; Border, Richard; Pazokitoroudi, Ali; Cai, Na; Ayroles, Julien F; Sankararaman, Sriram; et al (December 2024, Cell Genomics)

Full Text Available
Personalized mood prediction from patterns of behavior collected with smartphones

https://doi.org/10.1038/s41746-024-01035-6

Balliu, Brunilda; Douglas, Chris; Seok, Darsol; Shenhav, Liat; Wu, Yue; Chatzopoulou, Doxa; Kaiser, William; Chen, Victor; Kim, Jennifer; Deverasetty, Sandeep; et al (December 2024, npj Digital Medicine)

Abstract Over the last ten years, there has been considerable progress in using digital behavioral phenotypes, captured passively and continuously from smartphones and wearable devices, to infer depressive mood. However, most digital phenotype studies suffer from poor replicability, often fail to detect clinically relevant events, and use measures of depression that are not validated or suitable for collecting large and longitudinal data. Here, we report high-quality longitudinal validated assessments of depressive mood from computerized adaptive testing paired with continuous digital assessments of behavior from smartphone sensors for up to 40 weeks on 183 individuals experiencing mild to severe symptoms of depression. We apply a combination of cubic spline interpolation and idiographic models to generate individualized predictions of future mood from the digital behavioral phenotypes, achieving high prediction accuracy of depression severity up to three weeks in advance (R²≥ 80%) and a 65.7% reduction in the prediction error over a baseline model which predicts future mood based on past depression severity alone. Finally, our study verified the feasibility of obtaining high-quality longitudinal assessments of mood from a clinical population and predicting symptom severity weeks in advance using passively collected digital behavioral data. Our results indicate the possibility of expanding the repertoire of patient-specific behavioral measures to enable future psychiatric research.
more » « less
Full Text Available
Fast kernel-based association testing of non-linear genetic effects for biobank-scale data

https://doi.org/10.1038/s41467-023-40346-2

Fu, Boyang; Pazokitoroudi, Ali; Sudarshan, Mukund; Liu, Zhengtong; Subramanian, Lakshminarayanan; Sankararaman, Sriram (December 2023, Nature Communications)

Abstract Our knowledge of non-linear genetic effects on complex traits remains limited, in part, due to the modest power to detect such effects. While kernel-based tests offer a versatile approach to test for non-linear relationships between sets of genetic variants and traits, current approaches cannot be applied to Biobank-scale datasets containing hundreds of thousands of individuals. We propose, FastKAST, a kernel-based approach that can test for non-linear effects of a set of variants on a quantitative trait. FastKAST provides calibrated hypothesis tests while enabling analysis of Biobank-scale datasets with hundreds of thousands of unrelated individuals from a homogeneous population. We apply FastKAST to 53 quantitative traits measured across ≈ 300 K unrelated white British individuals in the UK Biobank to detect sets of variants with non-linear effects at genome-wide significance.
more » « less
Full Text Available
Leveraging family data to design Mendelian Randomization that is provably robust to population stratification

https://doi.org/10.1101/gr.277664.123

LaPierre, Nathan; Fu, Boyang; Turnbull, Steven; Eskin, Eleazar; Sankararaman, Sriram (May 2023, Genome Research)

Mendelian Randomization (MR) has emerged as a powerful approach to leverage genetic instruments to infer causality between pairs of traits in observational studies. However, the results of such studies are susceptible to biases due to weak instruments as well as the confounding effects of population stratification and horizontal pleiotropy. Here, we show that family data can be leveraged to design MR tests that are provably robust to confounding from population stratification, assortative mating, and dynastic effects. We demonstrate in simulations that our approach, MR-Twin, is robust to confounding from population stratification and is not affected by weak instrument bias, while standard MR methods yield inflated false positive rates. We then conducted an exploratory analysis of MR-Twin and other MR methods applied to 121 trait pairs in the UK Biobank dataset. Our results suggest that confounding from population stratification can lead to false positives for existing MR methods, while MR-Twin is immune to this type of confounding, and that MR-Twin can help assess whether traditional approaches may be inflated due to confounding from population stratification.
more » « less
Full Text Available
Robust Mendelian randomization in the presence of residual population stratification, batch effects and horizontal pleiotropy

https://doi.org/10.1038/s41467-022-28553-9

Cinelli, Carlos; LaPierre, Nathan; Hill, Brian L.; Sankararaman, Sriram; Eskin, Eleazar (December 2022, Nature Communications)

Abstract Mendelian Randomization (MR) studies are threatened by population stratification, batch effects, and horizontal pleiotropy. Although a variety of methods have been proposed to mitigate those problems, residual biases may still remain, leading to highly statistically significant false positives in large databases. Here we describe a suite of sensitivity analysis tools that enables investigators to quantify the robustness of their findings against such validity threats. Specifically, we propose the routine reporting of sensitivity statistics that reveal the minimal strength of violations necessary to explain away the MR results. We further provide intuitive displays of the robustness of the MR estimate to any degree of violation, and formal bounds on the worst-case bias caused by violations multiple times stronger than observed variables. We demonstrate how these tools can aid researchers in distinguishing robust from fragile findings by examining the effect of body mass index on diastolic blood pressure and Townsend deprivation index.
more » « less
Full Text Available

« Prev Next »

Search for: All records